Containers & Packaging


Semantic-Metric Bayesian Risk Fields: Learning Robot Safety from Human Videos with a VLM Prior

Chen, Timothy, Dominguez-Kuhne, Marcus, Swann, Aiden, Liu, Xu, Schwager, Mac

arXiv.org Artificial Intelligence

Humans interpret safety not as a binary signal but as a continuous, context- and spatially-dependent notion of risk. While risk is subjective, humans form rational mental models that guide action selection in dynamic environments. This work proposes a framework for extracting implicit human risk models by introducing a novel, semantically conditioned and spatially varying parametrization of risk, supervised directly from safe human demonstration videos and VLM common sense. Notably, we define risk through a Bayesian formulation: the prior is furnished by a pretrained vision-language model, and a likelihood function modulates the prior to produce a relative metric of risk that is more human-aligned. Specifically, the likelihood is a learned ViT that maps pretrained features to pixel-aligned risk values. Our pipeline ingests RGB images and a query object string, producing pixel-dense risk images. These images can then be used as value predictors in robot planning tasks or be projected into 3D for use in conventional trajectory optimization to produce human-like motion. This learned mapping enables generalization to novel objects and contexts, and has the potential to scale to much larger training datasets. In particular, the Bayesian framework enables fast adaptation of our model to additional observations or common-sense rules. We demonstrate that our proposed framework produces contextual risk that aligns with human preferences. Additionally, we illustrate several downstream applications of the model: as a value learner for visuomotor planners, or in conjunction with a classical trajectory optimization algorithm. Our results suggest that our framework is a significant step toward enabling autonomous systems to internalize human-like risk. Code and results can be found at https://riskbayesian.github.io/bayesian_risk/.
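The per-pixel Bayesian update described in the abstract (prior from a VLM, modulated by a learned likelihood, renormalized into a relative risk metric) can be sketched in a few lines. This is a minimal illustration only; the function name, the max-renormalization choice, and the toy inputs are assumptions, not the paper's implementation:

```python
def fuse_risk_map(prior, likelihood, eps=1e-8):
    """Element-wise Bayesian update over a 2-D image given as nested lists.

    posterior ∝ prior × likelihood per pixel, then rescaled so the
    riskiest pixel in the image is 1.0 (a *relative* risk metric,
    matching the abstract's framing of risk as non-absolute).
    """
    unnorm = [[p * l for p, l in zip(pr, lr)]
              for pr, lr in zip(prior, likelihood)]
    peak = max(max(row) for row in unnorm)
    return [[v / (peak + eps) for v in row] for row in unnorm]
```

In this scheme the VLM prior can be swapped or re-weighted without retraining the likelihood, which is what makes fast adaptation to new common-sense rules cheap.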


ConceptBot: Enhancing Robot's Autonomy through Task Decomposition with Large Language Models and Knowledge Graph

Leanza, Alessandro, Moroncelli, Angelo, Vizzari, Giuseppe, Braghin, Francesco, Roveda, Loris, Spahiu, Blerina

arXiv.org Artificial Intelligence

ConceptBot is a modular robotic planning framework that combines Large Language Models and Knowledge Graphs to generate feasible, risk-aware plans despite ambiguities in natural language instructions, while correctly analyzing the objects present in the environment--challenges that typically arise from a lack of commonsense reasoning. To do that, ConceptBot integrates (i) an Object Property Extraction (OPE) module that enriches scene understanding with semantic concepts from ConceptNet, (ii) a User Request Processing (URP) module that disambiguates and structures instructions, and (iii) a Planner that generates context-aware, feasible pick-and-place policies. In comparative evaluations against Google SayCan, ConceptBot achieved 100% success on explicit tasks, maintained 87% accuracy on implicit tasks (versus 31% for SayCan), reached 76% on risk-aware tasks (versus 15%), and outperformed SayCan in application-specific scenarios, including material classification (70% vs. 20%) and toxicity detection (86% vs. 36%). On SafeAgentBench, ConceptBot achieved an overall score of 80% (versus 46% for the next-best baseline). These results, validated in both simulation and laboratory experiments, demonstrate ConceptBot's ability to generalize without domain-specific training and to significantly improve the reliability of robotic policies in unstructured environments. Advances in recent decades in robotic core capabilities, i.e., perception, control, and manipulation, have increased demand for autonomous systems in fields ranging from manufacturing to healthcare, logistics to home care. These capabilities are deeply interconnected with the planning phase [1], as successful planning depends on a robot's ability to perceive its environment accurately, execute precise control, and perform effective manipulation. Despite significant progress, planning in robotic systems continues to face challenges, particularly in unstructured environments [2].
A key element in achieving effective planning is task decomposition [3], which involves breaking complex objectives into smaller, manageable actions. This process is essential for simplifying execution and ensuring flexibility in diverse environments. Traditional task decomposition approaches, however, often rely on rigid, pre-programmed templates or static models, which struggle to adapt to unfamiliar or dynamic conditions [4]-[7]. Recently, advancements in Large Language Models (LLMs) have introduced a more dynamic alternative. LLMs enable robots to process natural language instructions, understand contextual nuances, and dynamically decompose tasks into actionable steps [8]-[10]. However, directly employing pre-trained LLMs often leads to non-executable or ineffective plans, as these models struggle to account for domain-specific constraints and real-world feasibility [11]-[13].
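The three-stage flow the abstract describes (OPE enriching objects with ConceptNet properties, URP structuring the request, a planner producing risk-aware pick-and-place steps) can be sketched as follows. This is a hypothetical toy: the stub dictionary stands in for ConceptNet queries and the rule-based planner stands in for the LLM planner; no names here come from the ConceptBot codebase:

```python
# Stand-in for ConceptNet lookups performed by the OPE module.
CONCEPT_PROPERTIES = {
    "bleach": {"toxic", "liquid"},
    "apple": {"edible", "fragile"},
}

def extract_properties(objects):
    # OPE: annotate each detected object with commonsense properties.
    return {obj: CONCEPT_PROPERTIES.get(obj, set()) for obj in objects}

def plan_pick_and_place(request, objects):
    # URP would normally disambiguate `request`; here it is taken literally.
    props = extract_properties(objects)
    steps = []
    for obj in objects:
        if request == "put away the groceries" and "edible" in props[obj]:
            steps.append(("pick", obj))
            steps.append(("place", obj, "pantry"))
        if "toxic" in props[obj]:
            steps.append(("flag_risk", obj))  # risk-aware feasibility check
    return steps
```

The point of the decomposition is that commonsense knowledge (toxicity, fragility) is injected as explicit object properties rather than hoped for from the LLM's raw text output.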


Metals can be squeezed into sheets just a few atoms thick

New Scientist

Sheets of metal just two atoms thick can be produced by squashing molten droplets at great pressure between two sapphires. The researchers who developed the process say the unusual materials could have applications in industrial chemistry, optics and computers. Last year, scientists created a gold sheet that was a single atom thick, which they dubbed "goldene" after graphene, a material made of a single layer of carbon atoms. Such materials have been described as two-dimensional, as they are as thin as chemically possible. But making other 2D metals hadn't been possible until now. The new technique, developed by Luojun Du at the Chinese Academy of Sciences and his colleagues, can create 2D sheets of bismuth, gallium, indium, tin and lead that are as thin as their atomic bonds allow.

  Industry: Materials > Containers & Packaging (0.40)

Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback

Furuta, Hiroki, Zen, Heiga, Schuurmans, Dale, Faust, Aleksandra, Matsuo, Yutaka, Liang, Percy, Yang, Sherry

arXiv.org Artificial Intelligence

Large text-to-video models hold immense potential for a wide range of downstream applications. However, these models struggle to accurately depict dynamic object interactions, often resulting in unrealistic movements and frequent violations of real-world physics. One solution inspired by large language models is to align generated outputs with desired outcomes using external feedback. This enables the model to refine its responses autonomously, eliminating extensive manual data collection. In this work, we investigate the use of feedback to enhance the object dynamics in text-to-video models. We aim to answer a critical question: what types of feedback, paired with which specific self-improvement algorithms, can most effectively improve text-video alignment and realistic object interactions? We begin by deriving a unified probabilistic objective for offline RL finetuning of text-to-video models. This perspective highlights how design elements in existing algorithms like KL regularization and policy projection emerge as specific choices within a unified framework. We then use derived methods to optimize a set of text-video alignment metrics (e.g., CLIP scores, optical flow), but notice that they often fail to align with human perceptions of generation quality. To address this limitation, we propose leveraging vision-language models to provide more nuanced feedback specifically tailored to object dynamics in videos. Our experiments demonstrate that our method can effectively optimize a wide variety of rewards, with binary AI feedback driving the most significant improvements in video quality for dynamic interactions, as confirmed by both AI and human evaluations. Notably, we observe substantial gains when using reward signals derived from AI feedback, particularly in scenarios involving complex interactions between multiple objects and realistic depictions of objects falling.
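The KL-regularized alignment objective mentioned above has a well-known generic consequence: maximizing expected reward minus a KL penalty to a reference distribution, over a finite candidate set, yields exponentiated-reward weighting. A minimal numeric sketch of that standard result (not the paper's exact derivation; the uniform reference is an assumption):

```python
import math

def kl_regularized_weights(rewards, beta):
    """Closed-form solution of  max_q  E_q[r] - beta * KL(q || p)
    over a finite candidate set, with a uniform reference p:
    q(x) ∝ exp(r(x) / beta).  Lower beta trusts the reward more."""
    scores = [math.exp(r / beta) for r in rewards]
    z = sum(scores)
    return [s / z for s in scores]
```

In a finetuning loop these weights would re-weight the log-likelihood of each generated video, so samples with better feedback (e.g. binary AI feedback) pull the model harder while the KL term keeps it near the pretrained reference.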


Machine Learning in Industrial Quality Control of Glass Bottle Prints

Bundscherer, Maximilian, Schmitt, Thomas H., Bocklet, Tobias

arXiv.org Artificial Intelligence

In industrial manufacturing of glass bottles, quality control of bottle prints is necessary as numerous factors can negatively affect the printing process. Even minor defects in the bottle prints must be detected despite reflections in the glass or manufacturing-related deviations. In cooperation with our medium-sized industrial partner, two ML-based approaches for quality control of these bottle prints were developed and evaluated, which can also be used in this challenging scenario. Our first approach utilized different filters to suppress reflections (e.g. Sobel or Canny) and image quality metrics for image comparison (e.g. MSE or SSIM) as features for different supervised classification models (e.g. SVM or k-nearest neighbors), which resulted in an accuracy of 84%. The images were aligned based on the ORB algorithm, which allowed us to estimate the rotations of the prints, which may serve as an indicator for anomalies in the manufacturing process. In our second approach, we fine-tuned different pre-trained CNN models (e.g. ResNet or VGG) for binary classification, which resulted in an accuracy of 87%. Utilizing Grad-CAM on our fine-tuned ResNet-34, we were able to localize and visualize frequently defective bottle print regions. This method allowed us to provide insights that could be used to optimize the actual manufacturing process. This paper also describes our general approach and the challenges we encountered in practice with data collection during ongoing production, unsupervised preselection, and labeling.
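The first approach (edge filtering to suppress reflections, then image-comparison metrics as features) can be illustrated with a toy version: a horizontal Sobel response followed by MSE between edge maps, with a hypothetical fixed threshold. A real pipeline would use library implementations (e.g. OpenCV) and feed the metrics into a learned classifier rather than thresholding:

```python
def sobel_x(img):
    # Horizontal Sobel response on the interior pixels of a 2-D list image;
    # edges of the print survive, smooth reflection gradients are damped.
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (img[y-1][x+1] + 2*img[y][x+1] + img[y+1][x+1]
                         - img[y-1][x-1] - 2*img[y][x-1] - img[y+1][x-1])
    return out

def mse(a, b):
    # Mean squared error between two equal-sized 2-D list images.
    n = sum(len(row) for row in a)
    return sum((x - y) ** 2
               for ra, rb in zip(a, b)
               for x, y in zip(ra, rb)) / n

def is_defective(reference, inspected, threshold=5.0):
    # Compare edge maps instead of raw pixels; threshold is hypothetical.
    return mse(sobel_x(reference), sobel_x(inspected)) > threshold
```

Alignment (ORB in the paper) would run before this comparison so that print rotation does not register as a defect.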

  Genre: Research Report > New Finding (0.46)
  Industry: Materials > Containers & Packaging (0.62)

Viral Costco item sparks mixed reviews -- plus dermatologists reveal how often you should wash your face daily

FOX News

Fans of Costco have joined in on the conversation and are sharing their thoughts about viral glass containers that are available for purchase at the wholesaler. MIXED REVIEWS – Costco shoppers are igniting a conversation about a set of glass storage containers that are reportedly selling out fast. FACING FACTS – How often should you wash your face? If you want to avoid dullness, breakouts, inflammation and irritation, wash your face two times a day for overall skin health.


The Future of Recycling Is Sorty McSortface

The Atlantic - Technology

At the Boulder County Recycling Center in Colorado, two team members spend all day pulling items from a conveyor belt covered in junk collected from the area's bins. One plucks out juice cartons and plastic bottles that can be reprocessed, while the other searches for contaminants in the stream of paper products headed to a fiber mill. They are Sorty McSortface and Sir Sorts-a-Lot, AI-powered robots that each resemble a supercharged mechanical arm from an arcade claw machine. Developed by the tech start-up Amp Robotics, McSortface and Sorts-a-Lot's appendages dart down with the speed of long-beaked cranes picking fish out of the water, suctioning up items they've been trained to recognize. Yes, even recycling has gotten tangled up in the AI revolution. Amp Robotics has its tech in nearly 80 facilities across the U.S., according to a company spokesperson, and in recent years, AI-powered sorting from companies such as Bulk Handling Systems and MachineX has popped up in other recycling plants.


A look inside the lab building mushroom computers

#artificialintelligence

Upon first glance, the Unconventional Computing Laboratory looks like a regular workspace, with computers and scientific instruments lining its clean, smooth countertops. But if you look closely, the anomalies start appearing. A series of videos shared with PopSci show the weird quirks of this research: on top of the cluttered desks, there are large plastic containers with electrodes sticking out of a foam-like substance, and a massive motherboard with tiny oyster mushrooms growing on top of it. No, this lab isn't trying to recreate scenes from "The Last of Us." The researchers there have been working on projects like this for a while: the lab was founded in 2001 with the belief that the computers of the coming century will be made of chemical or living systems, or wetware, that will work in harmony with hardware and software.


How Artificial Intelligence Is Revolutionizing the Packaging Industry? - The Data Scientist

#artificialintelligence

Artificial intelligence is reshaping how businesses operate, and recent years have brought many impressive and genuinely useful developments. AI is now at work in almost every industry, such as food, cosmetics, wood, and medicine. Since every business requires packaging for its products, the packaging manufacturing industry occupies a central position, and AI is playing an increasingly important role in its advancement, transforming the way the packaging industry works.


OpenPack: A Large-scale Dataset for Recognizing Packaging Works in IoT-enabled Logistic Environments

Yoshimura, Naoya, Morales, Jaime, Maekawa, Takuya, Hara, Takahiro

arXiv.org Artificial Intelligence

Unlike datasets of human daily activities, existing publicly available sensor datasets for work activity recognition in industrial domains are limited by difficulties in collecting realistic data, as close collaboration with industrial sites is required. This also limits research on and development of AI methods for industrial applications. To address these challenges and contribute to research on machine recognition of work activities in industrial domains, in this study we introduce a new large-scale dataset for packaging work recognition called OpenPack. OpenPack contains 53.8 hours of multimodal sensor data, including keypoints, depth images, acceleration data, and readings from IoT-enabled devices (e.g., handheld barcode scanners used in work procedures), collected from 16 distinct subjects with different levels of packaging work experience. On the basis of this dataset, we propose a neural network model designed to recognize work activities, which efficiently fuses sensor data and readings from IoT-enabled devices by processing them within different streams in a ladder-shaped architecture, and experiments confirmed the effectiveness of this architecture. We believe that OpenPack will contribute to the community of action/activity recognition with sensors. The OpenPack dataset is available at https://open-pack.github.io/.
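The ladder-shaped fusion idea, as described, keeps sensor data and IoT readings in separate streams with lateral connections at each level. A purely structural sketch, with stub arithmetic transforms standing in for real network layers (everything here is a simplification for illustration, not the paper's model):

```python
def ladder_fusion(sensor_seq, iot_seq, depth=2):
    """Two parallel streams; at each level a lateral 'rung' injects the
    IoT stream into the sensor stream. Stubs replace conv/attention layers."""
    s, i = list(sensor_seq), list(iot_seq)
    for _ in range(depth):
        s = [x * 2 for x in s]              # sensor-stream layer (stub)
        i = [x + 1 for x in i]              # IoT-stream layer (stub)
        s = [a + b for a, b in zip(s, i)]   # lateral rung: fuse IoT into sensor
    return s
```

The design rationale is that sparse, discrete IoT events (a barcode scan) and dense continuous sensor streams have very different statistics, so fusing them late and repeatedly via rungs works better than concatenating them at the input.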